Dialog State Tracking with Reinforced Data Augmentation
Neural dialog state trackers are generally limited due to the lack of
quantity and diversity of annotated training data. In this paper, we address
this difficulty by proposing a reinforcement learning (RL) based framework for
data augmentation that can generate high-quality data to improve the neural
state tracker. Specifically, we introduce a novel contextual bandit generator
to learn fine-grained augmentation policies that can generate new effective
instances by choosing suitable replacements for the specific context. Moreover,
by alternately learning between the generator and the state tracker, we can
keep refining the generative policies to generate more high-quality training
data for the neural state tracker. Experimental results on the WoZ and MultiWoZ
(restaurant) datasets demonstrate that the proposed framework significantly
improves the performance over the state-of-the-art models, especially with
limited training data.
Comment: AAAI 202
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models
Information-seeking conversation, which aims to help users gather information
through conversation, has achieved great progress in recent years. However, the
research is still stymied by the scarcity of training data. To alleviate this
problem, we propose AutoConv for synthetic conversation generation, which takes
advantage of the few-shot learning ability and generation capacity of large
language models (LLM). Specifically, we formulate the conversation generation
problem as a language modeling task, then finetune an LLM with a few human
conversations to capture the characteristics of the information-seeking process
and use it for generating synthetic conversations with high quality.
Experimental results on two frequently-used datasets verify that AutoConv has
substantial improvements over strong baselines and alleviates the dependence on
human annotation. In addition, we also provide several analysis studies to
promote future research.
Comment: Accepted to ACL 2023 Main Conference (Short
NewsDialogues: Towards Proactive News Grounded Conversation
Hot news is one of the most popular topics in daily conversations. However,
news grounded conversation has long been stymied by the lack of a well-designed
task definition and by the scarcity of data. In this paper, we propose a novel task,
Proactive News Grounded Conversation, in which a dialogue system can
proactively lead the conversation based on some key topics of the news. In
addition, both information-seeking and chit-chat scenarios are included
realistically, where the user may ask a series of questions about the news
details or express their opinions and be eager to chat. To further develop this
novel task, we collect a human-to-human Chinese dialogue dataset
NewsDialogues, which includes 1K conversations with a total of 14.6K
utterances and detailed annotations for target topics and knowledge spans.
Furthermore, we propose a method named Predict-Generate-Rank, consisting of a
generator for grounded knowledge prediction and response generation, and a
ranker for the ranking of multiple responses to alleviate the exposure bias. We
conduct comprehensive experiments to demonstrate the effectiveness of the
proposed method and further present several key findings and challenges to
prompt future research.
Comment: Accepted to ACL 2023 Conference (Long Paper; Findings
FIMO: A Challenge Formal Dataset for Automated Theorem Proving
We present FIMO, an innovative dataset comprising formal mathematical problem
statements sourced from the International Mathematical Olympiad (IMO)
Shortlisted Problems. Designed to facilitate advanced automated theorem proving
at the IMO level, FIMO is currently tailored for the Lean formal language. It
comprises 149 formal problem statements, accompanied by both informal problem
descriptions and their corresponding LaTeX-based informal proofs. Through
initial experiments involving GPT-4, our findings underscore the existing
limitations in current methodologies, indicating a substantial journey ahead
before achieving satisfactory IMO-level automated theorem proving outcomes
- …